Replace Regexp in for headers for perf #140

@baweaver baweaver commented May 19, 2023

I noticed that Net::HTTP splits on a Regexp in the headers file, so I wanted to put in a quick patch for that. Here's some performance data to back up the change:

require 'benchmark/ips'
require 'memory_profiler'

# Current method
def capitalize(name)
  name.to_s.split(/-/).map {|s| s.capitalize }.join('-')
end

# Enhanced method
def capitalize_new(name)
  name.to_s.split('-').map(&:capitalize).join('-')
end

DEMO_STRING = 'abc-def-xyz'

puts "capitalize(DEMO_STRING): #{capitalize(DEMO_STRING)}"
puts "capitalize_new(DEMO_STRING): #{capitalize_new(DEMO_STRING)}"

raise 'Not equal' unless capitalize(DEMO_STRING) == capitalize_new(DEMO_STRING)

Benchmark.ips do |x|
  x.report("Current capitalize") { capitalize(DEMO_STRING) }
  x.report("Enhanced capitalize") { capitalize_new(DEMO_STRING) }

  x.compare!
end

# Warming up --------------------------------------
#   Current capitalize    74.573k i/100ms
#  Enhanced capitalize   111.936k i/100ms
# Calculating -------------------------------------
#   Current capitalize    716.449k (± 2.9%) i/s -      3.654M in   5.104556s
#  Enhanced capitalize      1.100M (± 2.5%) i/s -      5.597M in   5.093632s

# Comparison:
#  Enhanced capitalize:  1099514.8 i/s
#   Current capitalize:   716449.3 i/s - 1.53x  slower

# Now for memory profiling

puts 'Current capitalize', '=' * 50, ''

MemoryProfiler.report {
  10_000.times { capitalize(DEMO_STRING) }
}.pretty_print

# allocated memory by gem
# -----------------------------------
#    6_480_000  other
# allocated objects by class
# -----------------------------------
#     100_000  String
#      20_000  Array
#      10_000  MatchData

puts '', 'New capitalize', '=' * 50, ''

MemoryProfiler.report {
  10_000.times { capitalize_new(DEMO_STRING) }
}.pretty_print

# allocated memory by gem
# -----------------------------------
#    4_400_000  other
# allocated objects by class
# -----------------------------------
#      90_000  String
#      20_000  Array

We could also hoist and freeze the header delimiter for a further gain, but I wanted to keep this PR minimal, and a hoisted constant would add potential public API surface that users could come to depend on.
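For reference, a rough sketch of what the hoisted variant could look like. The module and constant names here are hypothetical and not part of this PR; `private_constant` is one way to keep the constant out of the public API, assuming that is the main concern.

```ruby
# Hypothetical stand-in for the header module; names are illustrative only.
module HeaderCapitalizeSketch
  # Hoisted, frozen delimiter: allocated once when the module is loaded.
  HEADER_DELIMITER = '-'.freeze
  # Hide the constant so it does not become public API surface.
  private_constant :HEADER_DELIMITER

  def self.capitalize(name)
    name.to_s.split(HEADER_DELIMITER).map(&:capitalize).join(HEADER_DELIMITER)
  end
end

puts HeaderCapitalizeSketch.capitalize('content-type')
```

Referencing `HeaderCapitalizeSketch::HEADER_DELIMITER` from outside the module raises `NameError`, so callers cannot start relying on it.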

My reasoning for doing this is that we have memory profiles from running Capybara tests that have this as a hot path in terms of memory and object allocations.
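If you want to cross-check the allocation numbers without the memory_profiler gem, a minimal sketch using only `GC.stat` from core Ruby might look like this. The `allocations` helper and the method names are hypothetical, and the exact counts will vary by Ruby version.

```ruby
# Regexp-based split, as in the current implementation.
def capitalize_regexp(name)
  name.to_s.split(/-/).map { |s| s.capitalize }.join('-')
end

# String-based split, as proposed in this PR.
def capitalize_string(name)
  name.to_s.split('-').map(&:capitalize).join('-')
end

# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

header = 'abc-def-xyz'
regexp_allocs = allocations { 10_000.times { capitalize_regexp(header) } }
string_allocs = allocations { 10_000.times { capitalize_string(header) } }

puts "Regexp split: #{regexp_allocs} objects"
puts "String split: #{string_allocs} objects"
```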

@baweaver

Side topic: I've found a few additional areas where we might get some easy wins if you're up to spot check me on some more PRs later @byroot.

@byroot

byroot commented May 20, 2023

Sure. But note that I'm not the maintainer, I can't merge your changes. https://github.com/ruby/ruby/blob/master/doc/maintainers.md#libnethttprb-libnethttpsrb

@baweaver

baweaver commented May 20, 2023

Fair fair. I might ask @nobu, @hsbt, or @nurse if they'd be up to work with me on that as well. Going to go back through some of the memory profiles I have to see where other hot paths are. I know there are a few in Response as well.

@baweaver

baweaver commented Aug 3, 2023

Would one of @nobu / @hsbt / @nurse be up to take a look at this?

@@ -491,7 +491,7 @@ def each_capitalized
   alias canonical_each each_capitalized

   def capitalize(name)
-    name.to_s.split(/-/).map {|s| s.capitalize }.join('-')
+    name.to_s.split('-'.freeze).map {|s| s.capitalize }.join('-'.freeze)
Member

Not what you are asking for, but another implementation worth trying could be:

def capitalize(name)
  name.to_s.gsub(/(\A|(?<=[\^\-]))([a-z])/) do |c|
    c.upcase!
    c
  end
end
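
One thing to keep in mind when comparing the two: they are not strictly equivalent on every input. A quick sketch to see where they differ (the method names and sample headers below are hypothetical, picked only to probe edge cases):

```ruby
# Split-based version from this PR.
def capitalize_split(name)
  name.to_s.split('-').map(&:capitalize).join('-')
end

# gsub-based version from the suggestion above.
def capitalize_gsub(name)
  name.to_s.gsub(/(\A|(?<=[\^\-]))([a-z])/) do |c|
    c.upcase!
    c
  end
end

# Hypothetical sample inputs: lowercase, uppercase, and a trailing hyphen.
['content-type', 'CONTENT-TYPE', 'x-request-id-'].each do |header|
  puts "#{header.inspect}: split=#{capitalize_split(header).inspect} " \
       "gsub=#{capitalize_gsub(header).inspect}"
end
```

String#capitalize also downcases the rest of each segment, so the split version turns 'CONTENT-TYPE' into 'Content-Type' while the gsub version leaves it untouched. String#split drops trailing empty fields, so the split version loses a trailing hyphen and the gsub version keeps it.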

Contributor

I did some benchmarking, and using the regex seemed to be slower.

@technicalpickles technicalpickles left a comment

I've been doing some benchmarking in this area too 😅

Using String with split is definitely faster than Regex. Freezing the string also can help, which I made a PR for over at #144
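
A small sketch of why freezing can help, assuming the point is that a frozen literal is reused rather than allocated on every call (the method names are hypothetical):

```ruby
# Without a frozen_string_literal magic comment, this literal is a new
# String object each time the method runs.
def delimiter_unfrozen
  '-'
end

# A literal followed directly by .freeze is reused across calls.
def delimiter_frozen
  '-'.freeze
end

puts "frozen literal is frozen?     #{delimiter_frozen.frozen?}"
puts "frozen literal same object?   #{delimiter_frozen.equal?(delimiter_frozen)}"
puts "unfrozen literal same object? #{delimiter_unfrozen.equal?(delimiter_unfrozen)}"
```

With `# frozen_string_literal: true` at the top of the file, the plain literal behaves like the frozen one, so the explicit `.freeze` is not needed.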

Some benchmarking I put together for this (including @byroot's gsub implementation): https://gist.github.com/technicalpickles/231940b1e64da1762df4a2e8fc53e1d8

@technicalpickles

@baweaver if you pull latest master down, you won't need to explicitly freeze now that #144 landed 🚀

@hsbt hsbt force-pushed the baweaver/performance/header-capitalize-regex-replacement branch from 96e5305 to 826e008 Compare May 30, 2024 09:04
@hsbt hsbt merged commit d39f1e3 into ruby:master May 30, 2024
15 checks passed
@hsbt

hsbt commented May 30, 2024

Sorry for the late response. This PR will help with https://bugs.ruby-lang.org/issues/20205 in the future.
